Cross-Lingual Morphological Tagging for Low-Resource Languages

Abstract

Morphologically rich languages often lack the annotated linguistic resources required to develop accurate natural language processing tools. We propose models suitable for training morphological taggers with rich tagsets for low-resource languages without using direct supervision. Our approach extends existing approaches of projecting part-of-speech tags across languages, using bitext to infer constraints on the possible tags for a given word type or token. We propose a tagging model using Wsabie, a discriminative embedding-based model with rank-based learning. In our evaluation on 11 languages, on average this model performs on par with a baseline weakly-supervised HMM, while being more scalable. Multilingual experiments show that the method performs best when projecting between related language pairs. Despite the inherently lossy projection, we show that the morphological tags predicted by our models improve the downstream performance of a parser by +0.6 LAS on average.
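
The abstract mentions using bitext to infer constraints on the possible tags for a word type. The snippet below is only a minimal sketch of that general idea, not the authors' implementation: it assumes the source side is already tagged and word-aligned to the target side, and it keeps, for each target word type, the projected tags that account for at least a given share of its alignments. The function name `project_tag_constraints`, the `min_share` threshold, the tag strings, and the toy sentences are all hypothetical.

```python
from collections import Counter, defaultdict


def project_tag_constraints(bitext, min_share=0.2):
    """Derive type-level tag constraints for target words from aligned bitext.

    bitext: iterable of (source_tags, target_tokens, alignment) triples, where
    alignment is a list of (source_index, target_index) pairs.
    min_share: hypothetical cutoff; a tag is kept for a word type only if it
    makes up at least this fraction of the tags projected onto that type.
    """
    counts = defaultdict(Counter)
    for source_tags, target_tokens, alignment in bitext:
        for src_i, tgt_i in alignment:
            word = target_tokens[tgt_i].lower()
            counts[word][source_tags[src_i]] += 1

    constraints = {}
    for word, tag_counts in counts.items():
        total = sum(tag_counts.values())
        constraints[word] = {
            tag for tag, count in tag_counts.items() if count / total >= min_share
        }
    return constraints


# Toy data (invented for illustration): source-side tags, target tokens,
# and word alignments. The third sentence simulates a noisy projection.
TOY_BITEXT = [
    (["DET", "NOUN|Number=Plur", "VERB|Tense=Past"],
     ["die", "honde", "het", "geblaf"],
     [(0, 0), (1, 1), (2, 3)]),
    (["NOUN|Number=Plur", "VERB|Tense=Pres"],
     ["honde", "blaf"],
     [(0, 0), (1, 1)]),
    (["DET", "NOUN|Number=Sing"],
     ["die", "honde"],
     [(0, 0), (1, 1)]),
]

constraints = project_tag_constraints(TOY_BITEXT, min_share=0.4)
print(constraints["honde"])
```

Unaligned target words (here `het`) receive no constraint at all in this sketch; a tagger trained on such constraints would have to treat them as unconstrained.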

Bibliographic details

Authors: Buys, Jan; Botha, Jan A.
Year: 2016
Original format: PDF

Similar literature

1. NeuMorph: Neural Morphological Tagging for Low-Resource Languages - An Experimental Study for Indic Languages [J]. Chakrabarty Abhisek, Chaturvedi Akshay, Garain Utpal. ACM Transactions on Asian Language Information Processing, 2020, No. 1.
2. Learning cross-lingual phonological and orthagraphic adaptations: a case study in improving neural machine translation between low-resource languages [J]. Saurav Jha, Akhilesh Sudhakar, Anil Kumar Singh. Journal of Language Modelling, 2019, No. 2.
3. Automatic Wordnet Development for Low-Resource Languages using Cross-Lingual WSD [J]. Faili Hesham, Taghizadeh Nasrin. The Journal of Artificial Intelligence Research, 2016, No. 10.
4. Cross-Lingual Morphological Tagging for Low-Resource Languages [C]. Jan Buys, Jan A. Botha. Annual Meeting of the Association for Computational Linguistics, 2016.
5. Learning Deep Representations for Low-resource Cross-lingual Natural Language Processing [D]. Chen, Xilun. 2019.
6. Cross-lingual Unified Medical Language System entity linking in online health communities [O]. Yonatan Bitton, Raphael Cohen, Tamar Schifter, 2020.
7. Learning cross-lingual phonological and orthagraphic adaptations: a case study in improving neural machine translation between low-resource languages [O]. Saurav Jha, Akhilesh Sudhakar, Anil Kumar Singh. 2019.
